Fast Algorithms for Determining Protein Structure Similarity
Abstract
The problem of identifying the common three-dimensional structure between two protein molecules has received considerable attention from both the biology community and algorithms researchers. A number of similarity measures have been proposed for this purpose, among them the RMS distance, measures based on geometric hashing, and measures based on contact map overlap. Very recently, a new measure called the bottleneck matching metric has been used to measure similarity between two drug or protein molecules. Although experimental studies have indicated the robustness of this metric, all algorithms developed so far that are based on it have running times that are high-degree polynomials in the number of atoms in the protein molecules, making them infeasible for practical applications. In this paper we show that by exploiting a very simple structural property of the α-carbon backbone structures of proteins, the running time of some of these algorithms can be considerably improved. This can be further combined with fairly standard algorithmic techniques such as randomization and/or an approximate matching scheme for bipartite graphs. The resulting algorithms have running times that are nearly linear in the number of atoms in the proteins being compared, making the bottleneck matching measure a viable candidate for practical applications.
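To make the measure concrete, the following is a minimal sketch that assumes the usual definition of a bottleneck matching: among all one-to-one pairings of two equal-sized point sets, pick the one whose largest pairwise distance is smallest. It is a brute-force illustration (binary search over candidate distances plus a simple augmenting-path matching), not the near-linear algorithm the abstract describes; it holds both point sets fixed rather than optimizing over rigid motions, and the names `bottleneck_distance` and `has_perfect_matching` are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def has_perfect_matching(a, b, threshold):
    """Return True if every point of a can be paired with a distinct point
    of b using only pairs at distance <= threshold (augmenting paths)."""
    n = len(a)
    # Bipartite graph: a[i] is adjacent to b[j] when they are close enough.
    adj = [[j for j in range(n) if dist(a[i], b[j]) <= threshold]
           for i in range(n)]
    match_of_b = [-1] * n  # index in a currently matched to each b[j]

    def try_augment(i, seen):
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_of_b[j] == -1 or try_augment(match_of_b[j], seen):
                match_of_b[j] = i
                return True
        return False

    return all(try_augment(i, set()) for i in range(n))


def bottleneck_distance(a, b):
    """Smallest threshold admitting a perfect matching between a and b
    (assumed definition; no rigid motion is applied to either set)."""
    if len(a) != len(b):
        raise ValueError("point sets must have the same size")
    if not a:
        return 0.0
    # The optimum is always one of the pairwise distances.
    candidates = sorted({dist(p, q) for p in a for q in b})
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if has_perfect_matching(a, b, candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]


# Toy coordinates standing in for alpha-carbon positions (made-up values).
a = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
b = [(10.0, 0.0, 1.0), (0.0, 0.0, 2.0)]
print(bottleneck_distance(a, b))
```

This naive version computes all pairwise distances and reruns the matching at each step of the binary search, so its cost grows polynomially with the number of atoms even before any search over rigid motions is added.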
Similar resources
MINRMS: an efficient algorithm for determining protein structure similarity using root-mean-squared-distance
MOTIVATION Existing algorithms for automated protein structure alignment generate contradictory results and are difficult to interpret. An algorithm which can provide a context for interpreting the alignment and uses a simple method to characterize protein structure similarity is needed. RESULTS We describe a heuristic for limiting the search space for structure alignment comparisons between ...
Link Prediction using Network Embedding based on Global Similarity
Background: Link prediction is one of the most widely studied problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. Local link prediction approaches based on node structure are fast but not accurate enough. On the other hand, the global link predicti...
An improved opposition-based Crow Search Algorithm for Data Clustering
Data clustering is an ideal way of working with a huge amount of data and looking for a structure in the dataset. In other words, clustering is the classification of the same data; the similarity among the data in a cluster is maximum and the similarity among the data in the different clusters is minimal. The innovation of this paper is a clustering method based on the Crow Search Algorithm (CS...
A partition-based algorithm for clustering large-scale software systems
Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...
Fast overlapping of protein contact maps by alignment of eigenvectors
MOTIVATION Searching for structural similarity is a key issue of protein functional annotation. The maximum contact map overlap (CMO) is one of the possible measures of protein structure similarity. Exact and approximate methods known to optimize the CMO are computationally expensive and this hampers their applicability to large-scale comparison of protein structures. RESULTS In this article,...
Publication date: 2001